First, install the requirements:
This is included in the zip file, however they are listed here for transparency:
Next, start working relative to the jplspec (for example if you were working on a project called myproject.py):
.
├── jplspec
│  ├── __init__.py
│  ├── molecules.py
│  ├── setcover.py
│  ├── spectraldata.py
│  └── utility.py
├── project.py
├── README.md
└── requirements.txtNow we can start running the code!
First, we import the library:
Now we can reference the file with the SpectralFile class, and then read it into a SpectralData object:
path ="Titan/Titan/Win0.clean1.contsub_Jy.rest.scom.c.txt"
myfile = jpl.SpectralFile(path)
print(myfile)#> SpectralFile(file_location='Titan/Titan/Win0.clean1.contsub_Jy.rest.scom.c.txt', name=None, time=None, date=None, temp=None, units='GHz')
#> SpectralData(file_location='Titan/Titan/Win0.clean1.contsub_Jy.rest.scom.c.txt', name=None, time=None, date=None, temp=None, units='GHz', data=array([[ 2.2147966e+02, 2.7446039e-03],
#> [ 2.2148015e+02, -1.5861880e-02],
#> [ 2.2148064e+02, -1.2065790e-02],
#> ...,
#> [ 2.2335333e+02, -1.0672529e-03],
#> [ 2.2335382e+02, -1.3607520e-03],
#> [ 2.2335431e+02, -2.9259811e-03]], dtype=float32), frequency=array([221.47966, 221.48015, 221.48064, ..., 223.35333, 223.35382,
#> 223.35431], dtype=float32), intensity=array([ 0.0027446 , -0.01586188, -0.01206579, ..., -0.00106725,
#> -0.00136075, -0.00292598], dtype=float32), num_entries=3840)
Note we can wrap this all in one step with:
Please also note optional keyword arguments: name, temperature, time, data, units. These are not necessary, however they may be useful from an organizational perspective. All attributes of all objects in the jplspec library can be accessed via object.attribute
Now we can identify the spikes!
Identified using a three sigma threshold. This can be modified by writing a custom function that returns indices of true cases using np.where, and using the spike_method argument
#> Spike(data=array([[ 2.2147966e+02, 2.7446039e-03],
#> [ 2.2148015e+02, -1.5861880e-02],
#> [ 2.2148064e+02, -1.2065790e-02],
#> ...,
#> [ 2.2335333e+02, -1.0672529e-03],
#> [ 2.2335382e+02, -1.3607520e-03],
#> [ 2.2335431e+02, -2.9259811e-03]], dtype=float32), frequency=array([221.52263, 221.54118, 221.54167, 221.59442, 221.62518, 221.62567,
#> 221.62616, 221.6286 , 221.6325 , 221.69305, 221.7248 , 221.76385,
#> 221.76434, 221.98653, 221.98897, 221.98946, 222.01389, 222.01437,
#> 222.01486, 222.01633, 222.01682, 222.0173 , 222.0588 , 222.06076,
#> 222.06125, 222.08371, 222.0842 , 222.09494, 222.09543, 222.09592,
#> 222.09836, 222.09885, 222.09933, 222.09982, 222.10031, 222.1008 ,
#> 222.10129, 222.10178, 222.10715, 222.10764, 222.10959, 222.11203,
#> 222.11252, 222.11642, 222.11838, 222.11887, 222.1218 , 222.1257 ,
#> 222.12619, 222.12668, 222.12717, 222.12766, 222.12814, 222.12863,
#> 222.12912, 222.12961, 222.1301 , 222.13058, 222.13107, 222.13156,
#> 222.13205, 222.13351, 222.13547, 222.13596, 222.1384 , 222.13889,
#> 222.13937, 222.13986, 222.14035, 222.14084, 222.14133, 222.14182,
#> 222.1423 , 222.14279, 222.14719, 222.14767, 222.14816, 222.14865,
#> 222.14914, 222.14963, 222.15012, 222.1506 , 222.1511 , 222.15158,
#> 222.15207, 222.15256, 222.15305, 222.15353, 222.15402, 222.15549,
#> 222.15695, 222.15744, 222.15793, 222.15842, 222.1589 , 222.1594 ,
#> 222.15988, 222.16037, 222.16086, 222.16135, 222.16183, 222.16232,
#> 222.16281, 222.1633 , 222.16379, 222.16428, 222.16476, 222.16525,
#> 222.16574, 222.16623, 222.16672, 222.1672 , 222.1677 , 222.16818,
#> 222.16867, 222.16916, 222.16965, 222.17014, 222.17062, 222.17111,
#> 222.1716 , 222.17209, 222.17892, 222.18576, 222.18625, 222.18674,
#> 222.21164, 222.23509, 222.23558, 222.32494, 222.32542, 222.37572,
#> 222.45679, 222.45728, 222.46118, 222.58374, 222.58423, 222.62524,
#> 222.62573, 222.63257, 222.63306, 222.76344, 222.78249, 222.79079,
#> 222.81422, 222.91776, 222.91824, 222.91873, 223.00027, 223.00076,
#> 223.13457, 223.2459 , 223.24638, 223.24687, 223.24736, 223.34161],
#> dtype=float32), intensity=array([0.03284835, 0.03877318, 0.03288539, 0.03720951, 0.05853187,
#> 0.08491591, 0.06331633, 0.03301192, 0.03139719, 0.0336207 ,
#> 0.03562813, 0.04591975, 0.03429356, 0.03502963, 0.03396192,
#> 0.03818281, 0.03267767, 0.04255673, 0.03437934, 0.04328281,
#> 0.04266202, 0.03136031, 0.03447701, 0.04550486, 0.04848193,
#> 0.04093966, 0.03400404, 0.03151353, 0.04374534, 0.03124158,
#> 0.03469666, 0.05565396, 0.07425053, 0.07161437, 0.06504935,
#> 0.07152496, 0.06822967, 0.04492994, 0.03994365, 0.04085354,
#> 0.03125374, 0.03184366, 0.0321136 , 0.03139608, 0.03370965,
#> 0.03504742, 0.03116627, 0.03192464, 0.04326647, 0.04641563,
#> 0.05211566, 0.07108778, 0.09229068, 0.1230322 , 0.1345236 ,
#> 0.09091274, 0.05299176, 0.05250734, 0.05965727, 0.05811873,
#> 0.04186113, 0.03500009, 0.03322943, 0.0339509 , 0.03623006,
#> 0.04652108, 0.03770222, 0.03305039, 0.03956946, 0.04971356,
#> 0.05607868, 0.0540103 , 0.051415 , 0.03203447, 0.03607384,
#> 0.04520367, 0.04587346, 0.04407803, 0.0533353 , 0.09147322,
#> 0.1242162 , 0.107777 , 0.08164912, 0.06181676, 0.04563434,
#> 0.03779434, 0.03900097, 0.05082107, 0.05046627, 0.03096247,
#> 0.03386702, 0.03802484, 0.0368301 , 0.04243016, 0.04828916,
#> 0.04550148, 0.03783446, 0.04280617, 0.06605002, 0.07371385,
#> 0.07734492, 0.117156 , 0.1346142 , 0.09296115, 0.06307497,
#> 0.0562511 , 0.05211168, 0.05525983, 0.0670226 , 0.086752 ,
#> 0.1234378 , 0.1282079 , 0.08913841, 0.06674053, 0.05793724,
#> 0.04502389, 0.04376596, 0.04502972, 0.04501087, 0.05224849,
#> 0.06165801, 0.05180327, 0.03267532, 0.03184385, 0.03972083,
#> 0.03147551, 0.03246397, 0.04341188, 0.03356467, 0.04105018,
#> 0.03546823, 0.03776142, 0.03295065, 0.03648829, 0.03396528,
#> 0.03944029, 0.04025874, 0.03840566, 0.04196307, 0.03797742,
#> 0.03744666, 0.03322563, 0.03087771, 0.04028279, 0.03558374,
#> 0.04901849, 0.06510025, 0.04805615, 0.03715763, 0.04285347,
#> 0.03412285, 0.03438776, 0.0381277 , 0.03895236, 0.03554877,
#> 0.0309805 ], dtype=float32), index=array([ 88, 126, 127, 235, 298, 299, 300, 305, 313, 437, 502,
#> 582, 583, 1038, 1043, 1044, 1094, 1095, 1096, 1099, 1100, 1101,
#> 1186, 1190, 1191, 1237, 1238, 1260, 1261, 1262, 1267, 1268, 1269,
#> 1270, 1271, 1272, 1273, 1274, 1285, 1286, 1290, 1295, 1296, 1304,
#> 1308, 1309, 1315, 1323, 1324, 1325, 1326, 1327, 1328, 1329, 1330,
#> 1331, 1332, 1333, 1334, 1335, 1336, 1339, 1343, 1344, 1349, 1350,
#> 1351, 1352, 1353, 1354, 1355, 1356, 1357, 1358, 1367, 1368, 1369,
#> 1370, 1371, 1372, 1373, 1374, 1375, 1376, 1377, 1378, 1379, 1380,
#> 1381, 1384, 1387, 1388, 1389, 1390, 1391, 1392, 1393, 1394, 1395,
#> 1396, 1397, 1398, 1399, 1400, 1401, 1402, 1403, 1404, 1405, 1406,
#> 1407, 1408, 1409, 1410, 1411, 1412, 1413, 1414, 1415, 1416, 1417,
#> 1418, 1432, 1446, 1447, 1448, 1499, 1547, 1548, 1731, 1732, 1835,
#> 2001, 2002, 2010, 2261, 2262, 2346, 2347, 2361, 2362, 2629, 2668,
#> 2685, 2733, 2945, 2946, 2947, 3114, 3115, 3389, 3617, 3618, 3619,
#> 3620, 3813]), spike_method=<function three_sigma_spike at 0x7fbc3deefcb0>, units='GHz')
To adjust the spike identification method, we write a function which returns a boolean value. We have 4 attributes to play with, intensity, frequency, std, and mean. These are all optinally used, so we can write spike identification methods with whatever we want. As an example, if we are looking for two sigma events, we could write:
Next, we query the spectroscopy datasets for our spikes:
Note this will only take a long time once if we use the save_path argument. This saves the molecule information in a standard python dictionary.
Finally, we can score all the molecules and generate the sets!
#> Restructuring data...
#>
#>
identifying matches...: 0%| | 0/67 [00:00<?, ?it/s]
identifying matches...: 1%|1 | 1/67 [00:01<01:53, 1.72s/it]
identifying matches...: 3%|2 | 2/67 [00:04<02:13, 2.06s/it]
identifying matches...: 4%|4 | 3/67 [00:06<02:02, 1.92s/it]
identifying matches...: 6%|5 | 4/67 [00:07<01:44, 1.65s/it]
identifying matches...: 7%|7 | 5/67 [00:09<01:45, 1.71s/it]
identifying matches...: 9%|8 | 6/67 [00:09<01:15, 1.23s/it]
identifying matches...: 10%|# | 7/67 [00:09<01:02, 1.04s/it]
identifying matches...: 12%|#1 | 8/67 [00:10<00:58, 1.01it/s]
identifying matches...: 13%|#3 | 9/67 [00:11<00:48, 1.19it/s]
identifying matches...: 15%|#4 | 10/67 [00:12<00:56, 1.01it/s]
identifying matches...: 16%|#6 | 11/67 [00:15<01:31, 1.63s/it]
identifying matches...: 18%|#7 | 12/67 [00:16<01:23, 1.52s/it]
identifying matches...: 19%|#9 | 13/67 [00:17<01:01, 1.13s/it]
identifying matches...: 22%|##2 | 15/67 [00:26<01:51, 2.14s/it]
identifying matches...: 24%|##3 | 16/67 [00:26<01:26, 1.70s/it]
identifying matches...: 25%|##5 | 17/67 [00:26<01:02, 1.25s/it]
identifying matches...: 27%|##6 | 18/67 [00:27<00:55, 1.13s/it]
identifying matches...: 28%|##8 | 19/67 [00:27<00:39, 1.22it/s]
identifying matches...: 30%|##9 | 20/67 [00:31<01:12, 1.53s/it]
identifying matches...: 31%|###1 | 21/67 [00:33<01:29, 1.96s/it]
identifying matches...: 33%|###2 | 22/67 [00:36<01:40, 2.23s/it]
identifying matches...: 34%|###4 | 23/67 [00:38<01:25, 1.95s/it]
identifying matches...: 36%|###5 | 24/67 [00:40<01:27, 2.04s/it]
identifying matches...: 37%|###7 | 25/67 [00:41<01:07, 1.61s/it]
identifying matches...: 39%|###8 | 26/67 [00:41<00:55, 1.36s/it]
identifying matches...: 40%|#### | 27/67 [00:43<00:56, 1.42s/it]
identifying matches...: 42%|####1 | 28/67 [00:46<01:11, 1.84s/it]
identifying matches...: 43%|####3 | 29/67 [00:46<00:52, 1.39s/it]
identifying matches...: 45%|####4 | 30/67 [00:48<00:59, 1.62s/it]
identifying matches...: 46%|####6 | 31/67 [00:48<00:43, 1.21s/it]
identifying matches...: 48%|####7 | 32/67 [00:53<01:14, 2.12s/it]
identifying matches...: 49%|####9 | 33/67 [00:53<00:51, 1.52s/it]
identifying matches...: 51%|##### | 34/67 [00:53<00:37, 1.14s/it]
identifying matches...: 52%|#####2 | 35/67 [00:54<00:34, 1.08s/it]
identifying matches...: 54%|#####3 | 36/67 [00:55<00:32, 1.04s/it]
identifying matches...: 55%|#####5 | 37/67 [00:56<00:35, 1.20s/it]
identifying matches...: 57%|#####6 | 38/67 [00:59<00:49, 1.72s/it]
identifying matches...: 58%|#####8 | 39/67 [01:05<01:18, 2.80s/it]
identifying matches...: 60%|#####9 | 40/67 [01:08<01:22, 3.06s/it]
identifying matches...: 61%|######1 | 41/67 [01:09<00:59, 2.29s/it]
identifying matches...: 64%|######4 | 43/67 [01:10<00:42, 1.77s/it]
identifying matches...: 66%|######5 | 44/67 [01:11<00:36, 1.60s/it]
identifying matches...: 67%|######7 | 45/67 [01:26<02:02, 5.56s/it]
identifying matches...: 69%|######8 | 46/67 [01:50<03:52, 11.05s/it]
identifying matches...: 70%|####### | 47/67 [01:51<02:39, 7.99s/it]
identifying matches...: 72%|#######1 | 48/67 [01:52<01:55, 6.10s/it]
identifying matches...: 73%|#######3 | 49/67 [01:53<01:20, 4.50s/it]
identifying matches...: 75%|#######4 | 50/67 [01:54<01:00, 3.54s/it]
identifying matches...: 76%|#######6 | 51/67 [02:21<02:48, 10.56s/it]
identifying matches...: 78%|#######7 | 52/67 [02:22<01:52, 7.49s/it]
identifying matches...: 79%|#######9 | 53/67 [02:24<01:21, 5.81s/it]
identifying matches...: 81%|######## | 54/67 [02:24<00:54, 4.22s/it]
identifying matches...: 82%|########2 | 55/67 [02:27<00:45, 3.78s/it]
identifying matches...: 84%|########3 | 56/67 [02:37<01:03, 5.73s/it]
identifying matches...: 85%|########5 | 57/67 [02:40<00:49, 5.00s/it]
identifying matches...: 87%|########6 | 58/67 [02:43<00:38, 4.30s/it]
identifying matches...: 88%|########8 | 59/67 [02:44<00:27, 3.41s/it]
identifying matches...: 90%|########9 | 60/67 [02:55<00:39, 5.69s/it]
identifying matches...: 91%|#########1| 61/67 [02:56<00:24, 4.11s/it]
identifying matches...: 93%|#########2| 62/67 [02:59<00:18, 3.72s/it]
identifying matches...: 94%|#########4| 63/67 [02:59<00:11, 2.76s/it]
identifying matches...: 96%|#########5| 64/67 [03:02<00:07, 2.63s/it]
identifying matches...: 97%|#########7| 65/67 [03:02<00:04, 2.01s/it]
identifying matches...: 99%|#########8| 66/67 [03:03<00:01, 1.71s/it]
identifying matches...: 100%|##########| 67/67 [03:05<00:00, 1.79s/it]
identifying matches...: 100%|##########| 67/67 [03:05<00:00, 2.77s/it]
#>
identifying sets...: 0%| | 0/156 [00:00<?, ?it/s]
identifying sets...: 15%|#4 | 23/156 [00:00<00:00, 224.62it/s]
identifying sets...: 30%|### | 47/156 [00:00<00:00, 216.83it/s]
identifying sets...: 37%|###7 | 58/156 [00:00<00:01, 90.37it/s]
identifying sets...: 46%|####5 | 71/156 [00:00<00:00, 89.27it/s]
identifying sets...: 51%|#####1 | 80/156 [00:01<00:01, 44.30it/s]
identifying sets...: 60%|#####9 | 93/156 [00:01<00:01, 54.97it/s]
identifying sets...: 65%|######5 | 102/156 [00:01<00:01, 49.06it/s]
identifying sets...: 71%|####### | 110/156 [00:02<00:02, 15.80it/s]
identifying sets...: 74%|#######4 | 116/156 [00:03<00:03, 13.27it/s]
identifying sets...: 78%|#######7 | 121/156 [00:03<00:02, 16.77it/s]
identifying sets...: 85%|########4 | 132/156 [00:03<00:01, 21.97it/s]
identifying sets...: 93%|#########2| 145/156 [00:03<00:00, 29.00it/s]
identifying sets...: 100%|##########| 156/156 [00:03<00:00, 41.34it/s]
scored_molecules = result.likeliest_molecules()
scored_sets = result.likeliest_sets()
pprint.pprint(scored_molecules)#> [('Cyanobutadiynylide anion', 1.0),
#> ('Hydroxyl radical', 1.0),
#> ('Methyl Cyanide', 1.0),
#> ('Ammonia', 1.0),
#> ('Propane', 0.9999999999583432),
#> ('Methyl Acetylene', 0.9161703896648132),
#> ('Ethyl Cyanide', 0.8099832333355084),
#> ('Formamide', 0.6109420482708394),
#> ('Vinylamine', 0.6082647521369084),
#> ('<i>anti</i>-Ethanol', 0.42540923657029633),
#> ('Aminoethanol', 0.2857142277663929),
#> ('Pyrrole', 0.22772082801623222),
#> ('α-Alanine', 0.20958550457779118),
#> ('Vinylcyanoacetylene', 0.17290649126828636),
#> ('Benzonitrile', 0.15903824640153372),
#> ('Peroxynitric acid', 0.15803021057632077),
#> ('n-Butyronitrile, gauche/anti combined', 0.14009449852619932),
#> ('Glycolaldehyde', 0.1384634450891849),
#> ('Chlorine nitrate', 0.098571200733809),
#> ('Propenoic acid', 0.09555601219383479),
#> ('Hydrogen peroxide', 0.09037408118828617),
#> ('Ethylene Glycol', 0.08772099801492059),
#> ('Carbonyl cyanide', 0.07917975457390376),
#> ("1,2-propanediol, g'G'g", 0.07065886451173851),
#> ('Cyanoethenone', 0.06561668770664496),
#> ('Chlorine dioxide', 0.062289741259311845),
#> ('Bromine Dioxide', 0.05858545985438459),
#> ('Acetone', 0.05512240556141538),
#> ('Anisole', 0.05302893229596831),
#> ('Diketene', 0.04696095467958493),
#> ('n-Butyl cyanide', 0.04340865406312234),
#> ('Tetrasulfur', 0.03850043867000284),
#> ('Methyl Formate', 0.03249443426799928),
#> ('Phenol', 0.032381647962831904),
#> ('gauche-n-butyronitrile, v<sub>29</sub> = 1', 0.027664800540490163),
#> ('gauche-n-butyronitrile, v<sub>30</sub> = 2', 0.02625203366729838),
#> ("1,3-propanediol, gGG'g", 0.024132415847265012),
#> ('Aminoacetonitrile', 0.023838424529676546),
#> ('anti-n-butyronitrile, v<sub>30</sub> = 2', 0.01925216706224276),
#> ('Phenylacetylene', 0.015298323848510771),
#> ('Methyl formate, v<sub>t</sub> = 0, 1', 0.015255375826319039),
#> ('Chlorine peroxide', 0.011370695145544453),
#> ('Glycine', 0.010325526430301914),
#> ("1,2-propanediol, aG'g", 0.00969158152531432),
#> ('Ethyl cyanide', 0.004260061548544715),
#> ('Hydroxyacetonitrile', 0.004011715870671322),
#> ('anti-n-butyronitrile, v<sub>29</sub> = 1', 0.003517007235251116),
#> ('2-Cyanobutane', 0.0033249337049754374),
#> ('Sulfuric acid', 0.0026797163812313967),
#> ("1,2-propanediol, gG'a", 0.001150130377220704),
#> ('<i>cyclo</i>-Propyl cyanide', 0.0011417077711069284),
#> ('3-Methylbutyronitrile', 0.001038514035906483),
#> ('Vinyl Isocyanide', 0.0009273909477218936),
#> ('iso-propyl cyanide', 0.0008605087284565887),
#> ('Nitric acid', 2.422384279594392e-05),
#> ('(Z)-Cyanovinylacetylene', 3.1447455510536433e-06),
#> ('Vinyl Cyanide', 3.966180009588776e-07),
#> ('Cyanoallene', 1.9872345977895267e-07),
#> ('Butyronitrile', 1.6043680634860682e-07),
#> ('Dimethyl ether', 6.497156324498979e-08),
#> ('Acetaldehyde', 2.4464085516366753e-08),
#> ('Cyclopropynylidyne', 1.924516228614758e-08),
#> ('Ethanol, gauche/anti combined', 1.7689270578734603e-08),
#> ('Hydroxyacetone', 1.5315561323259803e-08),
#> ('trans-Ethanol', 8.527885620594696e-09),
#> ('gauche-Ethanol', 8.527885620594696e-09)]
#> [{('(Z)-Cyanovinylacetylene', 3.1447455510536433e-06),
#> ('<i>cyclo</i>-Propyl cyanide', 0.0011417077711069284),
#> ('Aminoethanol', 0.2857142277663929),
#> ('Butyronitrile', 1.6043680634860682e-07),
#> ('Chlorine peroxide', 0.011370695145544453),
#> ('Dimethyl ether', 6.497156324498979e-08),
#> ('Hydroxyl radical', 1.0),
#> ('Methyl Acetylene', 0.9161703896648132),
#> ('Methyl Cyanide', 1.0),
#> ('Peroxynitric acid', 0.15803021057632077),
#> ('Phenol', 0.032381647962831904),
#> ('Phenylacetylene', 0.015298323848510771),
#> ('Propane', 0.9999999999583432),
#> ('Propenoic acid', 0.09555601219383479),
#> ('Sulfuric acid', 0.0026797163812313967),
#> ('Tetrasulfur', 0.03850043867000284),
#> ('Vinyl Cyanide', 3.966180009588776e-07)},
#> 0.2680498315712268]
We can visualize this with fig.show(), just like matplotlib: